This research is a case study of an information technology (IT) solution
company. A crucial problem in the company's hardware sales strategy makes it
difficult to predict how many of each item will be sold, which leads to excess
or shortage in hardware stock. This research uses a machine learning approach
to cluster the items and to forecast the number of items sold in each cluster.
The clustering methods are k-means clustering, agglomerative hierarchical
clustering (AHC), and Gaussian mixture models (GMM); the forecasting methods
are autoregressive integrated moving average (ARIMA) and recurrent neural
network-long short-term memory (RNN-LSTM). For clustering, k-means with two
attributes, "Quantity" and "Stock", gives the best result in this case study:
a silhouette score of 0.91 and a Davies-Bouldin index (DBI) value of 0.34
with 3 clusters. For forecasting, RNN-LSTM is the best method because it
produces more cost savings than the ARIMA method. The percentage of the
difference in saving costs between ARIMA and RNN-LSTM to the actual cost is 83%.

1. INTRODUCTION

Business has recently become fast-moving and fast-growing, which pushes companies to compete with
each other [1]. Many companies are engaged in information technology (IT) solutions, and hardware sales is
one line of business that continues to grow. Sales of products, both hardware and software, contribute
substantially to the company's finances. So far, however, the company has had a crucial problem in its
hardware sales strategy: it is difficult to predict the number of items that will be sold, and the company
sometimes experiences excess or shortage in hardware inventory.

According to [2], sales forecasting can be used by companies and other organizations to anticipate
future events. If a company forecasts sales incorrectly, undesirable things can happen. For example, the
company may be unable to meet a sudden increase in consumer demand, or demand may fall short of the
company's estimates so that existing goods go unsold. In other words, the company may end up with excess
stock, which brings losses. This is in line with [3], which argues that sales forecasting is very important.
Sales forecasting refers to predicting the future by assuming that past factors (in this case, historical
data) will have an influence in the future. Clustering is also needed to see which hardware items are
classified as high, medium, or low according to previously determined characteristics, which then become the
reference for forecasting. In addition to data on goods sold and existing stock, other characteristics in the
data can also be used to analyze clusters of the hardware sold. The aim is to support a promotional strategy
that the company can use [4].

The most widely used clustering method is k-means clustering [5]. K-means is a grouping algorithm
that separates data into different clusters. Its use depends on the data obtained and the conclusions to be
reached at the end of the process. The result of k-means clustering is affected by the selection of the
initial cluster centroids [6]. The k-means clustering algorithm has two requirements: the number of clusters
must be determined in advance, and the attributes must be numeric, because the algorithm can only process
numerical data [7]. Besides k-means clustering, there is agglomerative hierarchical clustering (AHC).
Hierarchical clustering is a technique that forms a hierarchy, producing a tree structure, so the grouping
process is carried out in stages. There are two approaches to hierarchical clustering, namely agglomerative
(bottom-up) and divisive (top-down) [8]. There are also Gaussian mixture models (GMM). A GMM is a model
consisting of Gaussian function components [9].

A very popular method for forecasting is the recurrent neural network-long short-term memory
(RNN-LSTM) method. LSTM is a type of RNN that can be used to learn patterns in time series data [10]. This
is in line with [11], which states that LSTM performs better in practice. LSTM is universal: given enough
network units and the right weight matrix, which can be viewed as a program, it can compute anything a
conventional computer can compute. Another algorithm for time series data is the autoregressive integrated
moving average (ARIMA). ARIMA is a forecasting technique that uses correlations within a time series: the
model finds patterns of correlation between series of observations [12]. Based on this background, this
research focuses on implementing data mining with machine learning methods as a solution to the recurring
problem of excess or insufficient warehouse stock, and as a way to inform hardware sales strategies and
increase the company's turnover.


2. RELATED WORKS 

The researchers reviewed journal literature related to the chosen research topic in order to understand
the subject and to identify the research problems, which supports the course of the research. Related works
on data mining in sales are summarized in Table 1, related works on clustering in Table 2, and related works
on forecasting in Table 3.


Table 1. Related works for data mining in sales 


References | Method | Process | Result
Fithri and Wardhana [13] | k-means clustering | Data collection, data mining, analysis with k-means clustering, testing with Davies-Bouldin index (DBI) values | Successfully implemented the k-means clustering algorithm for the sales cluster. Testing with DBI values produced a value of 0.2.
Johannes and Alamsyah [14] | Decision tree | Cross-industry standard process for data mining (CRISP-DM) | Managed to determine the prediction of the number of items sold by the viewers, the price, and the type of shoes.
Soepriyanto et al. [15] | k-nearest neighbor (k-NN) and Naive Bayes | Data collection, analysis, implementation, testing | Successfully predicted stock prices. Naive Bayes produced an accuracy value of 69.38, and the k-NN method produced an accuracy value of 67.25%.
Nadeak and Ali [16] | Apriori | Data mining, association, Apriori algorithm, testing | Successfully utilized artificial intelligence techniques in drug sales.
Edastama et al. [17] | Apriori | Data cleaning, data integration, data selection | Succeeded in obtaining information on the most in-demand items so that it can be used to increase sales growth and marketing of eyewear.


Based on Table 1, sales of goods need to be analyzed to help the company manage its sales strategy
and increase revenue. In this study, the researchers use k-means clustering, AHC, and GMM for clustering,
and ARIMA and RNN-LSTM for forecasting. This research is expected to help increase the company's profits
and to inform the sales strategy for the following year.


Table 2. The summary of the related works in clustering


References | Method | Process | Result
Puspita and Sasmita [18] | k-means clustering | Determine the number of clusters, determine the centroid value of each cluster, calculate the distance between data, and calculate the minimum object distance | Successfully implemented the k-means algorithm in classifying tourist visits to the city of Pagar Alam to increase visitors.
Rani et al. [19] | k-means clustering and frequent pattern (FP) growth | Data collection, k-means algorithm and FP Growth algorithm, analyzing data, implementing data | Succeeded in grouping student score data to make it easier for students to take expertise courses in the next semester.
Irawan [20] | k-means clustering | CRISP-DM | Successfully applied data mining techniques with the k-means clustering method, which aims to help students determine the correct course according to the established criteria.
Shen et al. [21] | GMM | Description of the operation dataset, analysis of heating load patterns, GMM clustering for heating load patterns, prediction model, and evaluation of the proposed models | GMM can help analyze the timing and energy signals of each sub-pattern.
Rashid et al. [22] | k-means clustering and GMM | Adding peripheral cluster, dataset description, conditioning on previous frames, adding constraints, combinatorial clustering (k-means and GMM), determine K, data processing with k-means and GMM | Successfully applied the k-means and GMM methods and tested the method using sparse multidimensional data obtained from video game sales all around the world.


Table 3. The summary of the related works in forecasting 


References | Method | Process | Result
Xu et al. [23] | ARIMA and deep belief network (DBN) | Data collection, training data, prediction, testing | Successfully demonstrated that the model has a high predictive accuracy and may be a useful tool for time series forecasting.
Gupta et al. [24] | Support vector machine (SVM), Prophet forecasting model, and linear regression | Data collection, perform SVM, perform linear regression, perform Prophet forecasting model, train and test models | Successfully predicted active rate, death rate, and cured rate in India by analyzing COVID-19 data.
Malki et al. [25] | ARIMA | Dataset description, perform ARIMA models, model selection, data normalization, experimental result, and evaluation | The study predicts that there could be a second rebound of the pandemic within one year. This helps the government to act quickly.
Alabdulrazzaq et al. [26] | ARIMA | Analysis, ARIMA parameters optimization, ARIMA model validation | Managed to apply the ARIMA model for the prediction of COVID-19 in Kuwait and obtained good accuracy.


3. THEORY AND METHODS 

For clustering, the researchers use the k-means clustering, AHC, and GMM methods. For forecasting,
they use the ARIMA and RNN-LSTM methods. The following subsections explain the theory and methods for
clustering and forecasting.


3.1. K-means clustering 

K-means clustering is an unsupervised data mining technique that groups data by partitioning. The
data are divided into several groups such that members of the same group have similar characteristics, while
members of different groups have different characteristics [27]. In other words, objects whose behavior is
close are placed in one cluster, and objects that are far apart or dissimilar are placed in different
clusters [27]. The clustering steps with the k-means algorithm are: i) Define the number of clusters K; ii)
Initialize the centroids randomly; iii) Assign each object to its nearest centroid using Euclidean distance;
iv) Recalculate the mean of the data in each cluster to get the new centroids; and v) Reassign the data to
the new centroids. If the cluster memberships no longer change, the algorithm stops; if the cluster centers
are still changing, it returns to step iii and repeats until the clusters stop changing. The steps of k-means
clustering are shown in the process flow in Figure 1. The Euclidean distance formula according to [28] is:

$$d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2} \tag{1}$$

Where:
d = Distance
i = Index of the data attribute (1 to n)
y = Centroid
x = Data


[Figure 1 flowchart: Determine the number of clusters -> Determine the center of the cluster -> Group objects based on the closest distance -> Recalculate centroids to get new centroids -> Cluster changing? (No: end)]


Figure 1. Flow process of k-means clustering
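
The k-means steps described in this subsection can be sketched in a few lines of NumPy. This is a minimal illustration and not the implementation used in the study: the function name `kmeans`, the `seed` and `max_iter` parameters, and the small `items` array of (quantity, stock) pairs are all assumptions made for the example.

```python
import numpy as np

def kmeans(data, k, max_iter=100, seed=0):
    """Plain k-means with Euclidean distance (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    # Step ii: pick k distinct data points as the initial centroids.
    centroids = data[rng.choice(len(data), size=k, replace=False)]
    labels = np.full(len(data), -1)
    for _ in range(max_iter):
        # Step iii: Euclidean distance from every point to every centroid.
        distances = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
        new_labels = distances.argmin(axis=1)
        # Stop when no point changes cluster.
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # Step iv: move each centroid to the mean of its members.
        for j in range(k):
            members = data[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    return labels, centroids

# Hypothetical (quantity, stock) pairs forming two obvious groups.
items = [[0, 0], [1, 1], [2, 2], [20, 20], [21, 21], [22, 22]]
labels, centroids = kmeans(items, k=2)
```

Because the initial centroids are chosen at random, the numeric label given to each group can differ between seeds; only the grouping itself is meaningful.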


3.2. AHC 

Agglomerative hierarchical clustering is a hierarchical grouping method with a bottom-up approach.
The grouping process starts with each data point as its own group, then repeatedly finds the closest pair of
groups and merges them into a larger group. This is in line with Krisman et al., who describe hierarchical
clustering as a technique that forms a hierarchy, producing a tree structure, so the grouping process is
carried out in stages. There are two approaches to hierarchical clustering, namely agglomerative (bottom-up)
and divisive (top-down) [29]. The steps of the AHC method are: i) Calculate the Euclidean distances; ii)
Merge the two closest clusters; iii) Update the distance matrix according to the agglomerative clustering
method, for example single linkage, average linkage, or complete linkage; iv) Repeat steps ii and iii until
the defined number of clusters is reached; and v) Output the clustering.
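
The AHC steps above can be sketched as a naive NumPy loop. This is an illustrative sketch under stated assumptions, not the study's code: the function name `agglomerative`, the `linkage` parameter values, and the sample `items` are hypothetical, and instead of updating a stored distance matrix (step iii) the sketch recomputes cluster-to-cluster distances from the point-to-point matrix, which gives the same merges for these three linkages.

```python
import numpy as np

def agglomerative(data, n_clusters, linkage="single"):
    """Naive bottom-up hierarchical clustering (illustrative sketch)."""
    data = np.asarray(data, dtype=float)
    # Every point starts as its own cluster.
    clusters = [[i] for i in range(len(data))]
    # Step i: pairwise Euclidean distances between all points.
    point_dist = np.linalg.norm(data[:, None, :] - data[None, :, :], axis=2)

    def cluster_distance(a, b):
        block = point_dist[np.ix_(a, b)]
        if linkage == "single":
            return block.min()
        if linkage == "complete":
            return block.max()
        return block.mean()  # average linkage

    # Steps ii-iv: merge the two closest clusters until n_clusters remain.
    while len(clusters) > n_clusters:
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                d = cluster_distance(clusters[p], clusters[q])
                if best is None or d < best[0]:
                    best = (d, p, q)
        _, p, q = best
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]

    # Step v: turn the member lists into one label per point.
    labels = np.empty(len(data), dtype=int)
    for label, members in enumerate(clusters):
        labels[members] = label
    return labels

# Hypothetical (quantity, stock) pairs forming two obvious groups.
items = [[0, 0], [1, 1], [2, 2], [20, 20], [21, 21], [22, 22]]
ahc_labels = agglomerative(items, n_clusters=2, linkage="average")
```

The nested search makes this sketch slow on large inputs; it is meant to mirror the listed steps rather than to be efficient.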


3.3. GMM 

A Gaussian mixture model (GMM) is a type of density model consisting of Gaussian function
components [9]. GMM is applied to learn the distribution parameters based on the optimal threshold that
corresponds to the minimum calculated error probability [21]. GMM is an accurate method in which the number
of clusters is predetermined [30]. This is in line with [31], which notes that GMM can be used for data
clustering. GMM is a mathematical model that attempts to estimate the probability density of a data
distribution using a finite mixture of Gaussian distributions. The Gaussian is the most widely used
distribution. When GMM is used as a clustering method, the number of clusters has to be determined [32].

The expectation-maximization (EM) algorithm is an iterative method for maximum likelihood
estimation of the distribution parameters. EM consists of two steps: expectation (E-step) and maximization
(M-step) [33]. The following is the formula for the EM algorithm:


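$$p(x \mid \theta) = \sum_{j=1}^{k} \pi_j \, p(x \mid \theta_j) \tag{2}$$


The steps are:
- Input: Training dataset
- Initialize: $\pi_j$, $\mu_j$, $\Sigma_j$ for each distribution function $j$
- Repeat:
  - E-step:

$$w_{ij} = P(j \mid x_i) = \frac{\pi_j \, p(x_i \mid \theta_j)}{\sum_{l=1}^{k} \pi_l \, p(x_i \mid \theta_l)} \tag{3}$$

  - M-step (update parameters):

$$\mu_j = \frac{\sum_{i=1}^{n} w_{ij} x_i}{\sum_{i=1}^{n} w_{ij}}, \qquad \Sigma_j = \frac{\sum_{i=1}^{n} w_{ij} (x_i - \mu_j)(x_i - \mu_j)^{T}}{\sum_{i=1}^{n} w_{ij}} \tag{4}$$

$$\pi_j = \frac{1}{n} \sum_{i=1}^{n} w_{ij}$$

- Do this until the parameters do not change


Below is an explanation of the symbols:

$\Sigma_j$ = Variance-covariance matrix
$\mu_j$ = Mean
$w_{ij}$ = Mixture ratio
$p(x \mid \theta_j)$ = Mixture components
$x_i$ = Random variable
$\pi_j$ = Mixture proportion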
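
To make the E-step and M-step concrete, the following is a minimal NumPy sketch of EM for a one-dimensional Gaussian mixture. It is an assumption-laden illustration rather than the study's implementation: the function name `gmm_em_1d`, the quantile-based initialization of the means, the small variance floor, the fixed iteration count, and the sample data `x` are all choices made for this example.

```python
import numpy as np

def gmm_em_1d(x, k, n_iter=100):
    """EM for a 1-D Gaussian mixture (illustrative sketch)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Initialization (assumed): equal proportions, means spread over
    # the data quantiles, and the overall variance for every component.
    proportions = np.full(k, 1.0 / k)
    means = np.quantile(x, (np.arange(k) + 0.5) / k)
    variances = np.full(k, x.var() + 1e-6)
    for _ in range(n_iter):
        # E-step: responsibility of component j for point i.
        density = np.exp(-0.5 * (x[:, None] - means[None, :]) ** 2 / variances)
        density /= np.sqrt(2.0 * np.pi * variances)
        weighted = proportions * density
        resp = weighted / weighted.sum(axis=1, keepdims=True)
        # M-step: update means, variances, and mixture proportions.
        totals = resp.sum(axis=0)
        means = (resp * x[:, None]).sum(axis=0) / totals
        variances = (resp * (x[:, None] - means) ** 2).sum(axis=0) / totals + 1e-6
        proportions = totals / n
    return proportions, means, variances, resp

# Hypothetical one-dimensional data with two well-separated groups.
x = [-1.0, 0.0, 1.0, 9.0, 10.0, 11.0]
proportions, means, variances, resp = gmm_em_1d(x, k=2)
```

Each row of `resp` holds the mixture ratios of one data point; assigning every point to the component with the largest ratio turns the fitted mixture into a clustering.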


3.4. RNN-LSTM 

An RNN is a machine learning architecture whose networks are connected in loops. These loops
allow information to persist [34]. The RNN method is known to be able to process sequential text
data. An RNN has three layers, namely an input layer, a hidden layer, and an output layer [35]. Figure 2
illustrates the RNN architecture. The LSTM method was first proposed by Hochreiter and Schmidhuber in 1997.
LSTM is a type of recurrent neural network that can be used to learn patterns in time series data [34]. In
the LSTM architecture, the cells are more complex than in a plain RNN. An LSTM has three gates: the input,
forget, and output gates. The input gate admits new data, the forget gate erases unimportant information,
and the output gate controls what affects the output. Figure 3 illustrates the LSTM architecture.
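
The role of the three gates can be illustrated with a single LSTM cell step in NumPy. The sketch follows the commonly used LSTM formulation, which is an assumption here because the text does not give the gate equations; the function names, the stacked weight layout (`W`, `U`, `b`), and the random weights are hypothetical and untrained.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step (illustrative sketch).

    W has shape (4 * hidden, input), U has shape (4 * hidden, hidden),
    and b has shape (4 * hidden,); the four blocks hold the input gate,
    forget gate, output gate, and candidate values, in that order.
    """
    hidden = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b
    input_gate = sigmoid(z[0:hidden])                # how much new data to admit
    forget_gate = sigmoid(z[hidden:2 * hidden])      # how much old state to keep
    output_gate = sigmoid(z[2 * hidden:3 * hidden])  # how much state to expose
    candidate = np.tanh(z[3 * hidden:4 * hidden])
    c_t = forget_gate * c_prev + input_gate * candidate
    h_t = output_gate * np.tanh(c_t)
    return h_t, c_t

# Run a short hypothetical sales sequence through the untrained cell.
rng = np.random.default_rng(0)
input_size, hidden_size = 1, 4
W = rng.normal(scale=0.1, size=(4 * hidden_size, input_size))
U = rng.normal(scale=0.1, size=(4 * hidden_size, hidden_size))
b = np.zeros(4 * hidden_size)
h = np.zeros(hidden_size)
c = np.zeros(hidden_size)
for value in [5.0, 7.0, 6.0]:
    h, c = lstm_step(np.array([value]), h, c, W, U, b)
```

The cell state `c` carries information across time steps, which is what lets the network retain patterns over a longer series than a plain RNN.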


3.5. ARIMA 

The autoregressive integrated moving average (ARIMA) is a model that ignores independent variables
entirely in forecasting [36]. The ARMA model is a combination of the autoregressive (AR) and moving
average (MA) models. The AR model describes the movement of a variable through its own past values, while
the MA model describes the movement of a variable through its past residuals [37]. ARIMA is also known as
the Box-Jenkins time series method and is well known in time series forecasting [38].
The following are the formulas for ARIMA:


Autoregressive (AR):

$$Y_t = \mu + \theta_1 Y_{t-1} + \theta_2 Y_{t-2} + \dots + \theta_p Y_{t-p} + \varepsilon_t \tag{5}$$

Moving Average (MA):

$$Y_t = \mu + \varepsilon_t - \omega_1 \varepsilon_{t-1} - \omega_2 \varepsilon_{t-2} - \dots - \omega_q \varepsilon_{t-q} \tag{6}$$

The symbols are described:

$Y_{t-1}$ = Time series data in the time period t-1
$\theta_{1,2,\dots,p}$ = Coefficient
$Y_t$ = True value in period t
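
As a small worked illustration of the autoregressive part only, the sketch below estimates the constant and the AR coefficients by ordinary least squares and produces a one-step forecast. It deliberately omits the differencing and moving average parts of ARIMA, and it is not the estimation procedure used in the study; the function names `fit_ar` and `forecast_next` and the noise-free sample series are assumptions.

```python
import numpy as np

def fit_ar(series, p):
    """Estimate the constant and AR(p) coefficients by least squares."""
    series = np.asarray(series, dtype=float)
    # Each row holds the p previous values, most recent first.
    rows = [series[t - p:t][::-1] for t in range(p, len(series))]
    design = np.column_stack([np.ones(len(rows)), np.array(rows)])
    target = series[p:]
    coeffs, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coeffs[0], coeffs[1:]

def forecast_next(series, mu, theta):
    """One-step-ahead forecast from the fitted AR model."""
    p = len(theta)
    recent = np.asarray(series[-p:], dtype=float)[::-1]
    return float(mu + theta @ recent)

# Hypothetical noise-free series generated by Y_t = 2 + 0.5 * Y_{t-1}.
series = [0.0]
for _ in range(19):
    series.append(2.0 + 0.5 * series[-1])

mu, theta = fit_ar(series, p=1)
next_value = forecast_next(series, mu, theta)
```

On this noise-free series the fit recovers the generating constant and coefficient; with real sales data the residual term would be non-zero and the estimates approximate.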


[Figure 2 diagram labels: Input Layer, Hidden Layer, Context]


Figure 2. RNN architecture


[Figure 3 diagram label: Output]


Figure 3. LSTM architecture


4. RESEARCH METHODOLOGY 

The framework is a logical sequence for solving the research problem, outlined as a flow diagram
from beginning to end so that the research runs systematically according to the concepts that have been
defined. The research framework for implementing data mining in clustering and forecasting is outlined in
Figure 4. Based on this framework, the first stage is a literature review, which is important for
identifying problems and determining the objectives of the research. The literature review collected
previous research journals on sales and on machine learning methods for clustering and forecasting. From
the literature review, the researchers concluded that k-means clustering, AHC, and GMM would be used for
clustering, because these methods are frequently used and produce accurate evaluation results. For
forecasting, ARIMA and RNN-LSTM would be used, for the same reasons. The literature review is followed by
identifying the research problems and then by data collection and analysis.

The next stage is modeling. First, clustering is performed using k-means clustering, AHC, and GMM,
and each method is evaluated. After the evaluation, the best clustering model is selected. Forecasting is
then performed using ARIMA and RNN-LSTM based on the clustering results. The forecasting models are
evaluated, and the best forecasting model is selected based on that evaluation. When the best methods for
clustering and forecasting have been selected and the research objectives have been achieved, the process
is complete.


[Figure 4 flowchart: Literature Review -> Identification of Problem -> Data Collection and Data Analysis -> Clustering Model -> Clustering Model Evaluation (Yes/No) -> Forecasting Model -> Forecasting Model Evaluation (Yes/No)]


Figure 4. Research methodology


5. PROPOSED METHODS 

This section describes the solutions proposed for the problems described in the background
section. The design and construction of the solution consists of five activities, illustrated in Figure 5.

The first step is to prepare the dataset and then pre-process it, for example by normalizing the
data. Once the data is deemed sufficient for modeling, clustering is performed using k-means clustering,
AHC, and GMM, and each model is trained on the data. Each method yields a clustering result, which is then
evaluated using the silhouette score and DBI values. The silhouette score is a method for validating the
consistency within clusters of data. The evaluation results of the three methods are then compared to see
which clustering method is better.

Once the best clustering result is obtained, the next step is forecasting with ARIMA and
RNN-LSTM. The first stage in forecasting is to train on the data, after which each method produces a
forecast of hardware sales. The RNN-LSTM method is implemented in Python. The first step is to import the
libraries; Scikit-learn is used in this method. The Sequential class from the Keras library is used to
connect the layers. The model uses an LSTM layer and a dense layer, and the dense layer outputs 1 neuron.
The input is reshaped into 3D form using NumPy reshape. The activation function is ReLU, the optimizer is
Adam, and the number of epochs is 50. Each model is then evaluated and the evaluation results are compared.
Forecasting is evaluated using root mean square error (RMSE) and by calculating the cost savings from each
forecasting model. RMSE measures the difference between actual and predicted values; a smaller RMSE
indicates a more accurate forecast than a larger one.

Once the evaluation results are available, the methods are compared to determine which is better
for forecasting. After this comparison, the researchers know which methods are best for clustering and
forecasting. Determining the best methods also achieves the main objectives of this research: developing
and implementing data mining on hardware sales with machine learning methods to determine clusters of the
hardware sold, and developing and implementing data mining for forecasting hardware sales based on those
clusters, in order to determine hardware stock as a sales strategy for the company.
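
Two of the forecasting steps mentioned above, reshaping the input into 3D form and computing RMSE, can be sketched with NumPy alone. This is a hedged illustration: the helper names `make_windows` and `rmse`, the window length of 3, and the `weekly_quantity` values are assumptions, and a naive last-value forecast stands in for a trained model so that the example stays self-contained.

```python
import numpy as np

def make_windows(series, timesteps):
    """Build sliding windows shaped (samples, timesteps, features)."""
    series = np.asarray(series, dtype=float)
    inputs, targets = [], []
    for start in range(len(series) - timesteps):
        inputs.append(series[start:start + timesteps])
        targets.append(series[start + timesteps])
    inputs = np.array(inputs)
    # LSTM layers expect 3D input: one feature per time step here.
    inputs = inputs.reshape(inputs.shape[0], timesteps, 1)
    return inputs, np.array(targets)

def rmse(actual, predicted):
    """Root mean square error between actual and predicted values."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

# Hypothetical quantities sold per period for one cluster.
weekly_quantity = [5, 7, 6, 8, 9, 11, 10, 12]
X, y = make_windows(weekly_quantity, timesteps=3)

# Stand-in forecast: repeat the last value of each window.
naive_forecast = X[:, -1, 0]
error = rmse(y, naive_forecast)
```

The same `rmse` function can score the ARIMA and RNN-LSTM forecasts on identical test targets, which keeps the comparison between the two methods like for like.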


[Figure 5 flowchart; recoverable labels: Proposed Methods, Prepare Dataset, Pre Processing and Data Analysis]


Figure 5. Flow diagram of the proposed methods


6. RESULTS AND DISCUSSION 

The following is a summary of the clustering scenarios that were carried out, listing the method,
number of clusters, attributes, silhouette score, and DBI value for each. The summary of clustering is in
Table 4.

Based on the clustering experiments summarized in Table 4, k-means clustering using two
attributes, "Quantity" and "Stock", is the best method for this case study. It obtained the highest
silhouette score, 0.91, and a DBI value of 0.34 with 3 clusters: 146 data points in cluster 1, 3 in
cluster 2, and 3 in cluster 3. Table 5 is a summary of forecasting.


Table 4. Evaluation result for clustering 


Method | Number of Clusters | Attribute | Silhouette | DBI Values
k-means clustering | 3 | Quantity, Stock | 0.91 | 0.34
k-means clustering | 3 | Quantity, Stock, Price | 0.80 | 0.37
k-means clustering | 3 | Quantity, Stock, Price, Customers | 0.79 | 0.32
AHC | 3 | Quantity, Stock | 0.87 | 0.50
AHC | 3 | Quantity, Stock, Price | 0.80 | 0.37
AHC | 3 | Quantity, Stock, Price, Customers | 0.79 | 0.28
GMM | 3 | Quantity, Stock | 0.68 | 0.72
GMM | 3 | Quantity, Stock, Price | 0.001 | 1
GMM | 3 | Quantity, Stock, Price, Customers | 0.74 | 0.55
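
For readers who want to see how the two scores in Table 4 are defined, the sketch below computes the silhouette score and the Davies-Bouldin index with NumPy. It follows the standard definitions with Euclidean distance and is not the code used to produce Table 4; the function names and the four sample points are assumptions.

```python
import numpy as np

def silhouette(data, labels):
    """Mean silhouette score over all points (higher is better)."""
    data = np.asarray(data, dtype=float)
    labels = np.asarray(labels)
    dist = np.linalg.norm(data[:, None, :] - data[None, :, :], axis=2)
    scores = np.zeros(len(data))
    for i in range(len(data)):
        same = labels == labels[i]
        if same.sum() == 1:
            continue  # singleton clusters get a score of 0 by convention
        # a: mean distance to the other members of the same cluster.
        a = dist[i, same].sum() / (same.sum() - 1)
        # b: smallest mean distance to the members of another cluster.
        b = min(dist[i, labels == other].mean()
                for other in np.unique(labels) if other != labels[i])
        scores[i] = (b - a) / max(a, b)
    return float(scores.mean())

def davies_bouldin(data, labels):
    """Davies-Bouldin index (lower is better)."""
    data = np.asarray(data, dtype=float)
    labels = np.asarray(labels)
    ids = np.unique(labels)
    centroids = np.array([data[labels == c].mean(axis=0) for c in ids])
    # Scatter: mean distance of each cluster's members to its centroid.
    scatter = np.array([
        np.linalg.norm(data[labels == c] - centroids[k], axis=1).mean()
        for k, c in enumerate(ids)
    ])
    ratios = []
    for k in range(len(ids)):
        ratios.append(max(
            (scatter[k] + scatter[m]) / np.linalg.norm(centroids[k] - centroids[m])
            for m in range(len(ids)) if m != k
        ))
    return float(np.mean(ratios))

# Hypothetical points in two compact, well-separated clusters.
points = [[0, 0], [0, 1], [10, 10], [10, 11]]
point_labels = [0, 0, 1, 1]
sil = silhouette(points, point_labels)
dbi = davies_bouldin(points, point_labels)
```

A silhouette score close to 1 and a DBI close to 0 both indicate compact, well-separated clusters, which is how the rows of Table 4 are compared.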


Table 5. Evaluation result for forecasting


Method | Detail | Attribute | Saving Cost
ARIMA | Forecasting Cluster 1 (above 10) | Date, Quantity, Stock | -100%
ARIMA | Forecasting Cluster 1 (below 10) | Date, Quantity, Stock | 85%
ARIMA | Forecasting Cluster 2 (above 10) | Date, Quantity, Stock | 64%
ARIMA | Forecasting Cluster 2 (below 10) | Date, Quantity, Stock | 99%
ARIMA | Forecasting Cluster 3 (above 10) | Date, Quantity, Stock | 88%
ARIMA | Forecasting Cluster 3 (below 10) | Date, Quantity, Stock | 100%
RNN-LSTM | Forecasting Cluster 1 (above 10) | Date, Quantity, Stock | -100%
RNN-LSTM | Forecasting Cluster 1 (below 10) | Date, Quantity, Stock | 86%
RNN-LSTM | Forecasting Cluster 2 (above 10) | Date, Quantity, Stock | 64%
RNN-LSTM | Forecasting Cluster 2 (below 10) | Date, Quantity, Stock | 99%
RNN-LSTM | Forecasting Cluster 3 (above 10) | Date, Quantity, Stock | 80%
RNN-LSTM | Forecasting Cluster 3 (below 10) | Date, Quantity, Stock | 100%


Table 5 shows the overall results of the forecasting evaluation, including the cost savings
obtained in the experiments with the ARIMA and RNN-LSTM methods. The RNN-LSTM method is better because it
produces more cost savings than the ARIMA method. The percentage of saving cost against actual cost based
on these two methods is 83%.


7. CONCLUSION 

Clustering was done using three methods, namely k-means clustering, AHC, and GMM, with three
experiments for each method. The first experiment uses the "Quantity" and "Stock" attributes, the second
uses the "Quantity", "Stock", and "Price" attributes, and the third uses the "Quantity", "Stock", "Price",
and "Customer" attributes. The best method for clustering is k-means clustering using 2 attributes, with a
silhouette score of 0.91 and a DBI value of 0.34. Forecasting was done using two methods, ARIMA and
RNN-LSTM, with six experiments for each method. The experiments use training and testing data from, in
order: cluster 1 above 10, cluster 1 below 10, cluster 2 above 10, cluster 2 below 10, cluster 3 above 10,
and cluster 3 below 10. The best method for forecasting is RNN-LSTM because it produces more cost savings
than the ARIMA method. The percentage of saving cost against actual cost is 83% for ARIMA and 84% for
RNN-LSTM. The percentage of the difference in saving costs between ARIMA and RNN-LSTM to the actual cost
is 83%.